ABSTRACT

In this paper, we briefly survey some results that have been established for a duality theory for
nonlinear programming problems. These results closely parallel those of the linear programming case.

INTRODUCTION

The general nonlinear programming maximization problem is

$$
\begin{aligned}
\text{Maximize } \quad & \Pi(X) = f(X) \\
\text{subject to } \quad & h(X) \le 0 \\
\text{and } \quad & X \ge 0
\end{aligned}
\tag{1}
$$

where, as usual, $h(X) = [h_1(X), \ldots, h_m(X)]'$. The associated Lagrangian function, used primarily in the
development of the Kuhn-Tucker [18] conditions and in the discussion of saddle points, is

$$L(X,Y) = f(X) - Y'[h(X)] \tag{2}$$

where $Y = [y_1, \ldots, y_m]'$. The gradient of $L(X,Y)$ with respect to the $y$'s is just $\nabla L_y = -h(X)$, and so $f(X)$ in
(1) could be expressed as

$$f(X) = L(X,Y) + Y'[h(X)] = L(X,Y) - Y'\nabla L_y.$$

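In addition, the requirements $h(X) \le 0$ can be expressed as $\nabla L_y \ge 0$, so the general nonlinear programming
maximization problem in (1) could also be written

$$
\begin{aligned}
\text{Maximize } \quad & L(X,Y) - Y'\nabla L_y \\
\text{subject to } \quad & -\nabla L_y \le 0 \\
\text{and } \quad & X \ge 0.
\end{aligned}
\tag{3}
$$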

This appears only to add complexity to the expression in (1), but it does suggest a “symmetric” problem:

$$
\begin{aligned}
\text{Minimize } \quad & L(X,Y) - X'\nabla L_x \\
\text{subject to } \quad & -\nabla L_x \ge 0 \\
\text{and } \quad & Y \ge 0.
\end{aligned}
\tag{4}
$$

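From $L(X,Y)$ in (2), and using the Jacobian matrix

$$
J = \begin{bmatrix} [\nabla h_1]' \\ \vdots \\ [\nabla h_m]' \end{bmatrix},
$$

we have

$$\nabla L_x = \nabla f - J'Y,$$

and so (4) in more detail is

$$
\begin{aligned}
\text{Minimize } \quad & \Delta(X,Y) = f(X) - Y'[h(X)] - X'[\nabla f - J'Y] \\
\text{subject to } \quad & J'Y - \nabla f \ge 0 \\
\text{and } \quad & Y \ge 0.
\end{aligned}
\tag{5}
$$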

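Written out in full, with $f_j = \partial f / \partial x_j$ and $h^i_j = \partial h_i / \partial x_j$, this is

$$
\begin{aligned}
\text{Minimize } \quad & f(X) - \sum_{i=1}^{m} y_i h_i(X) - \sum_{j=1}^{n} x_j \Big( f_j - \sum_{i=1}^{m} y_i h^i_j \Big) \\
\text{subject to } \quad & \sum_{i=1}^{m} y_i h^i_j - f_j \ge 0 \qquad (j = 1, \ldots, n) \\
\text{and } \quad & y_i \ge 0 \qquad (i = 1, \ldots, m).
\end{aligned}
\tag{6}
$$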

When $f(X)$ in the maximization problem is concave and each constraint function is convex or quasiconvex, so that the
Kuhn-Tucker conditions are both necessary and sufficient for a maximum of the nonlinear programming problem, (6) [or (5)] is
taken to be the dual to (1) [or (3)]. This is primarily because (I) a pair of dual linear programs corresponds precisely to (3)
and (4), and (II) a set of theorems parallel to those in duality theory for linear programs can be derived.

Regarding the linear programming connection, the general linear programming maximization problem is

$$
\begin{aligned}
\text{Maximize } \quad & P'X \\
\text{subject to } \quad & AX \le B \\
\text{and } \quad & X \ge 0.
\end{aligned}
$$

The associated Lagrangian function, as in (2), would be

$$L(X,Y) = P'X - Y'(AX - B) \tag{7}$$

and so $\nabla L_y = -[AX - B]$ and $\nabla L_x = [P - A'Y]$. (The latter gradient is defined as a column vector to
conform to the convention of expressing gradients as columns.) The primal linear program can now be written in the style of
(3) as

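$$
\begin{aligned}
\text{Maximize } \quad & P'X - Y'[AX - B] + Y'[AX - B] \\
\text{subject to } \quad & AX - B \le 0 \quad \text{or} \quad AX \le B \\
\text{and } \quad & X \ge 0.
\end{aligned}
\tag{8}
$$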

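Corresponding to (4) we have, for this linear program,

$$\text{Minimize } \quad P'X - Y'[AX - B] - X'[P - A'Y]$$

or, since $P'X = X'P$, $Y'B = B'Y$ and $Y'AX = X'A'Y$,

$$
\begin{aligned}
\text{Minimize } \quad & B'Y \\
\text{subject to } \quad & -P + A'Y \ge 0 \quad \text{or} \quad A'Y \ge P \\
\text{and } \quad & Y \ge 0.
\end{aligned}
$$

Clearly, (8) and this minimization problem are exactly a pair of dual linear programs.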
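The reduction of the Lagrangian form to the familiar dual objective can be checked numerically. The following is a minimal sketch in plain Python; the data P, A, B and the trial points are illustrative assumptions, not taken from the paper. It evaluates L(X,Y) - X'(P - A'Y) directly and compares it with B'Y.

```python
# Illustrative data (an assumption for this sketch, not from the paper):
#   maximize 3*x1 + 2*x2  subject to  x1 + x2 <= 4,  x1 + 3*x2 <= 6,  x >= 0
P = [3.0, 2.0]
A = [[1.0, 1.0], [1.0, 3.0]]
B = [4.0, 6.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def lagrangian(X, Y):
    # L(X, Y) = P'X - Y'(AX - B), as in (7)
    residual = [ax - b for ax, b in zip(matvec(A, X), B)]
    return dot(P, X) - dot(Y, residual)


def grad_x(Y):
    # Gradient of L with respect to X: P - A'Y
    return [p - aty for p, aty in zip(P, matvec(transpose(A), Y))]


def dual_objective(X, Y):
    # L(X, Y) - X' grad_x L, the objective of the "symmetric" problem (4)
    return lagrangian(X, Y) - dot(X, grad_x(Y))


X_feas, Y_feas = [1.0, 1.0], [3.0, 1.0]  # feasible, not optimal
X_opt, Y_opt = [4.0, 0.0], [3.0, 0.0]    # optimal pair for this data

print(dual_objective(X_feas, Y_feas), dot(B, Y_feas))  # both equal B'Y
print(dot(P, X_feas), dot(B, Y_feas))                  # primal value <= dual value
print(dot(P, X_opt), dual_objective(X_opt, Y_opt))     # equal at the optimum
```

For any X, the terms in X cancel, so the symmetric objective depends on Y alone and equals B'Y; the last line shows the two objective values coinciding at the optimal pair.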

We now explore a set of primal-dual theorems for the nonlinear programming problems in (1) and (5). We use a 
“prime” for these in the nonlinear case; there are obvious parallels to the results of linear programs. 

CONDITIONS FOR THE PRIMAL AND DUAL PROBLEMS TO HAVE FEASIBLE SOLUTIONS

Theorem 1

Feasible solutions to the primal and dual problems are optimal if and only if $\Pi(X^p) = \Delta(X^d, Y^d)$.

Clearly, if $\Pi = \Delta$, then both objective functions have reached their limits and so the solutions are optimal. An
important outcome of the proof is that, given an optimal $X^*$ for the primal, a vector $Y^*$ can be found such that $(X^*, Y^*)$
is an optimal solution to the dual.

Theorem 2

A pair of feasible solutions has $\Pi(X^*) = \Delta(X^*, Y^*)$ if and only if (I) $(Y^*)'[-h(X^*)] = 0$ and (II)
$(X^*)'[\nabla f^* - (J^*)'Y^*] = 0$.

Note that $[-h(X^*)]$ in (I) is the vector of slack variables for the constraints in (1), and $[\nabla f^* - (J^*)'Y^*]$ in (II) is
the negative of the vector of surplus variables for the constraints in (5). So this theorem describes a kind of
“complementary slackness” for optimal solutions that is parallel to that in the linear programming case. This property of the
optimal solutions to the pair of dual nonlinear programs shows the equivalence of the dual variables ($Y$) and the Lagrange
multipliers in the development of the Kuhn-Tucker conditions and in the saddle point problem.
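Theorems 1 and 2 can be illustrated on a small problem. The sketch below uses a one-variable example chosen for this purpose (an assumption, not from the paper): maximize f(x) = 4x - x^2 subject to h(x) = x - 1 <= 0 and x >= 0. Here f is concave and h is linear, the primal optimum is x* = 1, and the Kuhn-Tucker multiplier is y* = 2.

```python
# Hypothetical one-variable example, chosen only to illustrate Theorems 1 and 2.
def f(x):
    return 4.0 * x - x * x


def df(x):
    return 4.0 - 2.0 * x


def h(x):
    return x - 1.0


def dh(x):
    return 1.0


def primal_objective(x):
    return f(x)


def dual_objective(x, y):
    # Delta(x, y) = f(x) - y*h(x) - x*[f'(x) - y*h'(x)], the scalar form of (5)
    return f(x) - y * h(x) - x * (df(x) - y * dh(x))


def primal_feasible(x):
    return h(x) <= 0.0 and x >= 0.0


def dual_feasible(x, y):
    # Scalar form of J'Y - grad f >= 0 and Y >= 0
    return y * dh(x) - df(x) >= 0.0 and y >= 0.0


x_star, y_star = 1.0, 2.0

# Theorem 1: equal objective values at a feasible pair
print(primal_objective(x_star), dual_objective(x_star, y_star))

# Theorem 2: both complementary slackness products vanish
print(y_star * (-h(x_star)), x_star * (df(x_star) - y_star * dh(x_star)))

# A feasible but non-optimal pair: the dual value bounds the primal value
print(primal_objective(0.5), dual_objective(0.5, 3.5))
```

For this example the dual objective simplifies to x^2 + y, which makes it easy to confirm by hand that its minimum over the dual feasible region is 3, the same as the primal maximum.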

Theorem 3 

Under certain conditions, $\partial \Pi(X^*) / \partial b_i = y_i^*$, where $b_i$ is the right-hand-side constant of the $i$th
constraint in (1). This marginal valuation property of the dual variables at an optimum held for linear programs, subject to the
qualification that the appropriate derivatives existed. The same sort of requirement is needed here. Under those conditions
$y_i^*$ is a measure of the impact on the optimum value of the primal objective function of a marginal change in $b_i$.
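The marginal valuation property can be checked by finite differences. The sketch below is an illustration under stated assumptions: the example problem (maximize 4x - x^2 subject to x <= b, x >= 0, with b > 0) and the helper names are hypothetical and not from the paper. The optimal value is computed in closed form and its derivative with respect to b is compared with the multiplier.

```python
# Hypothetical example: maximize f(x) = 4x - x^2 subject to x <= b, x >= 0, b > 0.
def f(x):
    return 4.0 * x - x * x


def maximizer(b):
    # f peaks at x = 2, so the constrained maximizer is b when b < 2, else 2.
    return max(0.0, min(b, 2.0))


def optimal_value(b):
    return f(maximizer(b))


def multiplier(b):
    # From the Kuhn-Tucker condition f'(x*) - y = 0 when x* > 0;
    # this is zero whenever the constraint is slack (b >= 2).
    return 4.0 - 2.0 * maximizer(b)


def finite_difference_price(b, eps=1e-6):
    # Central-difference estimate of d(optimal value)/db
    return (optimal_value(b + eps) - optimal_value(b - eps)) / (2.0 * eps)


for b in (1.0, 1.5, 3.0):
    print(b, finite_difference_price(b), multiplier(b))
```

At b = 3 the constraint is not binding, so both the estimated derivative and the multiplier are zero, matching the interpretation that a resource in excess supply has zero marginal value.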

The Solution of Nonlinear Equations 

There are many numerical methods for locating roots. We present one simple method here. It is called Newton’s
method and is motivated as follows.

We assume that $f$ has continuous second derivatives and that some estimate $x_1$ of a solution to

$$f'(x) = 0 \tag{9}$$

is available. If no such estimate is known, $x_1$ is chosen at random. If $x_1$ is a reasonably good estimate, the
Taylor series expansion of $f'$ about $x_1$ can be approximated as

$$f'(x) \approx f'(x_1) + (x - x_1) f''(x_1).$$

Hence if $x$ is a solution to (9),

$$0 \approx f'(x_1) + (x - x_1) f''(x_1),$$

which gives

$$x = x_1 - f'(x_1) / f''(x_1). \tag{10}$$

Now unless f is a quadratic, x will not in general be an exact solution to (9). However, x can be used as an 
improved estimate. Indeed, (10) can be looked upon as the first equation in a family which generates successive improved 
estimates of a solution to (9). The family has the following general form:

$$x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}, \qquad n = 1, 2, \ldots \tag{11}$$



Once an estimate is finally found which is sufficiently close to a root, a new starting point can be selected in an 
effort to find a new root. This procedure is repeated until all roots are found. 
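The iteration (11) and the restart strategy can be sketched as follows. This is a minimal illustration in plain Python; the function name, tolerance, iteration cap, and test function f(x) = x^4 - 3x^2 + 2 are assumptions introduced here, not part of the paper.

```python
def newton_stationary_point(df, d2f, x1, tol=1e-10, max_iter=50):
    """Iterate x_{n+1} = x_n - f'(x_n)/f''(x_n) until successive estimates agree."""
    x = x1
    for _ in range(max_iter):
        curvature = d2f(x)
        if curvature == 0.0:
            raise ZeroDivisionError("f''(x) vanished; the Newton step is undefined")
        x_next = x - df(x) / curvature
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("Newton's method did not converge")


# Hypothetical test function f(x) = x^4 - 3x^2 + 2
def df(x):
    return 4.0 * x ** 3 - 6.0 * x


def d2f(x):
    return 12.0 * x ** 2 - 6.0


# Different starting points lead to different roots of f'(x) = 0.
for start in (2.0, 0.1, -2.0):
    print(start, newton_stationary_point(df, d2f, start))
```

The three starting points converge to the three stationary points of the test function (x = 0 and x = plus or minus the square root of 1.5), which mirrors the procedure of restarting from a new point to find each root in turn.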

Multidimensional Optimization with Equality Constraints 

This problem can be stated in general as follows:

$$\text{Maximize } \quad f(X) \tag{12}$$

$$\text{subject to } \quad g_j(X) = 0, \qquad j = 1, 2, \ldots, m, \tag{13}$$

where

$$X = (x_1, x_2, \ldots, x_n)^T.$$

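Consider the problem:

$$
\begin{aligned}
\text{Maximize } \quad & f(X) = x_1^2 + 2(x_2 - 4)^2 + 8 \\
\text{subject to } \quad & x_1 = x_2 - 4.
\end{aligned}
$$

Using the constraint to eliminate $x_1$, we are left with the following unconstrained problem in one dimension:

$$\text{Maximize } \quad f(x_2) = (x_2 - 4)^2 + 2(x_2 - 4)^2 + 8,$$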

Which is easier to solve. Of course, this approach of elimination will be successful in reducing the number of 
variables in the problem only if it is possible to express a solution for one or more of the variables explicitly. Often, 
however, this cannot be done. 

The Jacobian Method 

We now present a method which solves the problem (12), (13). It is assumed that $f$ and $g_j$, $j = 1, 2, \ldots, m$,
have continuous second derivatives. The strategy is to find a suitable expression for the first derivatives of $f$ at all points
which satisfy (13). The feasible stationary points of $f$ are the ones among these for which

$$\frac{\partial f}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n. \tag{14}$$



The maximum points are identified among those satisfying (14). 

These ideas are now placed on a firm mathematical basis. Consider any point $X$ which satisfies (13). In any
neighborhood of $X$ there will exist at least one point $X + h$ which satisfies (13), because $X$ is on the boundary of the region
defined by (13). Expanding $f$ and $g_j$, $j = 1, 2, \ldots, m$, in a Taylor series about $X$, we get

$$f(X + h) = f(X) + \nabla f(X)^T h + \tfrac{1}{2} h^T H_f(\theta X + (1 - \theta)(X + h))\, h,$$

$$g_j(X + h) = g_j(X) + \nabla g_j(X)^T h + \tfrac{1}{2} h^T H_{g_j}(\theta X + (1 - \theta)(X + h))\, h, \qquad j = 1, 2, \ldots, m,$$

for some $\theta$, $0 < \theta < 1$. As $X + h$ approaches $X$, we get

$$f(X + h) \approx f(X) + \nabla f(X)^T h,$$

$$g_j(X + h) \approx g_j(X) + \nabla g_j(X)^T h, \qquad j = 1, 2, \ldots, m.$$

Therefore

$$df(X) \approx \nabla f(X)^T dX,$$

$$dg_j(X) \approx \nabla g_j(X)^T dX, \qquad j = 1, 2, \ldots, m.$$

Using (13) we get

$$dg_j(X) = 0, \qquad j = 1, 2, \ldots, m.$$

Thus we can state, to within a first order approximation,

$$\nabla g_j(X)^T dX = 0, \qquad j = 1, 2, \ldots, m. \tag{15}$$

Now as $\nabla f(X)$ and $\nabla g_j(X)$, $j = 1, 2, \ldots, m$, consist of known constants, the expression for $df(X)$ together
with (15) constitutes a set of $(m+1)$ linear equations in $(n+1)$ unknowns, $dx_1, dx_2, \ldots, dx_n, df(X)$. If the equations are
linearly dependent, one discards the smallest number whose removal leaves an independent set. Hence we can assume that
there are no more equations than variables, i.e.

$$m \le n.$$

Now

$$m = n$$

leads to the unique solution

$$dX = 0,$$

which implies that there are no feasible points other than $X$ in any neighbourhood of $X$. That is, the set of feasible
points is discrete. Hence, we can assume that

$$m < n.$$

We redefine $X = (x_1, x_2, \ldots, x_n)^T$ as

$$X = (w_1, w_2, \ldots, w_m, y_1, y_2, \ldots, y_{n-m})^T. \tag{16}$$

The variables $w_i$, $i = 1, 2, \ldots, m$, are called state variables and the variables $y_i$, $i = 1, 2, \ldots, (n - m)$, are called decision
variables. Now (15) can be rewritten using (16), as follows:

$$\sum_{i=1}^{m} \frac{\partial f(X)}{\partial w_i}\, dw_i + \sum_{i=1}^{n-m} \frac{\partial f(X)}{\partial y_i}\, dy_i = df(X), \tag{17}$$

$$\sum_{i=1}^{m} \frac{\partial g_j(X)}{\partial w_i}\, dw_i + \sum_{i=1}^{n-m} \frac{\partial g_j(X)}{\partial y_i}\, dy_i = 0, \qquad j = 1, 2, \ldots, m. \tag{18}$$



Suppose now that the $dy_i$, $i = 1, 2, \ldots, (n - m)$, are given arbitrary values. When these are substituted into (18), unique
values for the $dw_i$, $i = 1, 2, \ldots, m$, can be found which keep $X + h$ inside the feasible region. One can then use all these
values in (17) to see if

$$df(X) > 0,$$

i.e., the new point $X + h$ is an improvement over $X$.

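We now state the explicit steps needed to carry this out using vector notation. The matrix

$$
J = \begin{bmatrix}
\dfrac{\partial g_1}{\partial w_1} & \dfrac{\partial g_1}{\partial w_2} & \cdots & \dfrac{\partial g_1}{\partial w_m} \\
\dfrac{\partial g_2}{\partial w_1} & \dfrac{\partial g_2}{\partial w_2} & \cdots & \dfrac{\partial g_2}{\partial w_m} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial w_1} & \dfrac{\partial g_m}{\partial w_2} & \cdots & \dfrac{\partial g_m}{\partial w_m}
\end{bmatrix}
$$

is called the Jacobian matrix, and the matrix

$$
C = \begin{bmatrix}
\dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_2} & \cdots & \dfrac{\partial g_1}{\partial y_{n-m}} \\
\dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_2} & \cdots & \dfrac{\partial g_2}{\partial y_{n-m}} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial y_1} & \dfrac{\partial g_m}{\partial y_2} & \cdots & \dfrac{\partial g_m}{\partial y_{n-m}}
\end{bmatrix}
$$

is called the control matrix. It is important in defining the state and decision variables that the left-hand sums in
(17) and (18) be linearly independent. It is always possible to choose which $x_i$'s become state variables so that this
happens, because we have assumed that the equations in (15) are linearly independent. The implication of this is that $J$ is
nonsingular. Now let

$$W = (w_1, w_2, \ldots, w_m)^T,$$

$$Y = (y_1, y_2, \ldots, y_{n-m})^T.$$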

Then (17) and (18) become

$$\nabla_W f^T\, dW + \nabla_Y f^T\, dY = df(W, Y) \tag{19}$$

and

$$J\, dW + C\, dY = 0 \tag{20}$$

respectively. As $J$ is nonsingular, we can multiply (20) by $J^{-1}$ to obtain

$$dW = -J^{-1} C\, dY. \tag{21}$$



It can be seen that if the elements in $dY$ are given values, $dW$ can be calculated using (21). Substituting this
into (19) yields

$$df(W, Y) = \left( \nabla_Y f^T - \nabla_W f^T J^{-1} C \right) dY. \tag{22}$$

From (22) we can form what is known as the constrained gradient of $f$ with respect to $Y$, which is

$$\nabla_Y^c f = \frac{\partial_c f(W, Y)}{\partial_c Y} = \nabla_Y f^T - \nabla_W f^T J^{-1} C. \tag{23}$$

Each element of $\nabla_Y^c f$, namely $\partial_c f / \partial_c y_i$, $i = 1, 2, \ldots, (n - m)$, is called a constrained derivative. It represents the
rate of change of $f$ resulting from perturbing $y_i$ (all other decision variables being held constant) to feasible points.
When constrained derivatives are used, a necessary condition for $X^*$ to be a feasible maximum is that

$$\nabla_Y^c f(X^*) = 0. \tag{24}$$



Equation (24) can be used to identify all the stationary points; it remains to find which one is the global maximum.
This can be examined with the Hessian-based conditions for an unconstrained maximum, with the modification that $H$ is the
matrix of constrained second derivatives with respect to the independent variables $y_1, y_2, \ldots, y_{n-m}$ only, and not
$w_1, w_2, \ldots, w_m$. The complete method will be illustrated with a numerical example.
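The constrained gradient in (23) can be computed without forming the inverse of J explicitly. The following is a minimal sketch in plain Python under stated assumptions: the helper names and the three-variable test problem (maximize -(w1^2 + w2^2 + y^2) subject to w1 + w2 + y - 3 = 0 and w1 - w2 = 0) are hypothetical and chosen only for illustration. It solves J'z = grad_W f and then forms grad_Y f - C'z, which equals the row vector in (23).

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("J is singular; choose different state variables")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def constrained_gradient(grad_w, grad_y, J, C):
    # z solves J'z = grad_w, so z' = grad_w' J^{-1}
    J_transposed = [list(col) for col in zip(*J)]
    z = solve(J_transposed, grad_w)
    # Entry k is df/dy_k - sum_j z_j * C[j][k], as in (23)
    return [gy - sum(z[j] * C[j][k] for j in range(len(z)))
            for k, gy in enumerate(grad_y)]


# Hypothetical problem: f = -(w1^2 + w2^2 + y^2),
# g1 = w1 + w2 + y - 3 = 0, g2 = w1 - w2 = 0 (state variables w1, w2; decision variable y)
J = [[1.0, 1.0], [1.0, -1.0]]  # dg_j/dw_i
C = [[1.0], [0.0]]             # dg_j/dy


def constrained_gradient_at(w1, w2, y):
    grad_w = [-2.0 * w1, -2.0 * w2]
    grad_y = [-2.0 * y]
    return constrained_gradient(grad_w, grad_y, J, C)


print(constrained_gradient_at(0.5, 0.5, 2.0))  # feasible, not stationary
print(constrained_gradient_at(1.0, 1.0, 1.0))  # feasible stationary point
```

On the feasible set of this test problem w1 = w2 = (3 - y)/2, so f reduces to a function of y alone with derivative 3 - 3y; the computed constrained derivative agrees with this at both points and vanishes at y = 1, as condition (24) requires.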

CONCLUSIONS 

There are two principal reasons for interest in nonlinear duality at this point: (1) for any feasible solutions to a pair
of primal and dual nonlinear programming problems, the dual objective function value provides a limit on the value of the
primal objective function (as with linear programs) and (2) for a pair of optimal solutions, the values of the dual variables
may have the same kind of “shadow price” interpretation that we associate with the linear programming case, giving a
possible marginal valuation to resources that are used up in the optimal solution and a value of zero to those resources that
are in excess supply at an optimal solution.